30 research outputs found

    Kin-cohort estimates for familial breast cancer risk in relation to variants in DNA base excision repair, BRCA1 interacting and growth factor genes

    BACKGROUND: Subtle functional deficiencies in highly conserved DNA repair or growth regulatory processes resulting from polymorphic variation may increase genetic susceptibility to breast cancer. Polymorphisms in DNA repair genes can alter protein function, leading to genomic instability facilitated by growth stimulation and to increased cancer risk. Thus, 19 single nucleotide polymorphisms (SNPs) in eight genes involved in base excision repair (XRCC1, APEX, POLD1), BRCA1 protein interaction (BRIP1, ZNF350, BRCA2), and growth regulation (TGFβ1, IGFBP3) were evaluated. METHODS: Genomic DNA samples were used in Taqman 5'-nuclease assays for most SNPs. Breast cancer risks to ages 50 and 70 were estimated using the kin-cohort method, in which genotypes of relatives are inferred based on the known genotype of the index subject and Mendelian inheritance patterns. Family cancer history data were collected from a series of genotyped breast cancer cases (N = 748) identified within a cohort of female US radiologic technologists. Among 2,430 female first-degree relatives of cases, 190 breast cancers were reported. RESULTS: Genotypes associated with increased risk were: XRCC1 R194W (WW and RW vs. RR, cumulative risk up to age 70, risk ratio (RR) = 2.3; 95% CI 1.3–3.8); XRCC1 R399Q (QQ vs. RR, cumulative risk up to age 70, RR = 1.9; 1.1–3.9); and BRIP1 (or BACH1) P919S (SS vs. PP, cumulative risk up to age 50, RR = 6.9; 1.6–29.3). The risks for those heterozygous for BRCA2 N372H and APEX D148E were significantly lower than the risks for homozygotes of either allele, and these were the only two results that remained significant after adjusting for multiple comparisons. No associations with breast cancer were observed for: APEX Q51H; XRCC1 R280H; IGFBP3 -202A>C; TGFβ1 L10P, P25R, and T263I; BRCA2 N289H and T1915M; BRIP1 -64A>C; and ZNF350 (or ZBRK1) 1845C>T, L66P, R501S, and S472P. CONCLUSION: Some variants in genes within the base-excision repair pathway (XRCC1) and BRCA1 interacting proteins (BRIP1) may play a role as low penetrance breast cancer risk alleles. Previous association studies of breast cancer and BRCA2 N372H and functional observations for APEX D148E ran counter to our findings of decreased risks. Due to the many comparisons, cautious interpretation and replication of these relationships are warranted.
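
The genotype-inference step that the kin-cohort method relies on can be illustrated with a short calculation. The following is a minimal sketch, not the authors' implementation: it assumes Hardy-Weinberg proportions and random mating, the variant allele frequency `q` is a hypothetical input, and all function names are invented for illustration. It covers only how a first-degree relative's genotype distribution follows from the index subject's genotype, not the estimation of cumulative risk itself.

```python
from fractions import Fraction


def population_genotype_probs(q):
    """Hardy-Weinberg probabilities of carrying 0, 1 or 2 variant alleles."""
    p = 1 - q
    return [p * p, 2 * p * q, q * q]


def parent_or_child_probs(index_copies, q):
    """Genotype distribution of a parent or child of the index subject.

    Such a relative shares exactly one allele identical by descent (IBD)
    with the index subject; the other allele is treated as a population draw.
    """
    # Probability that the shared allele is the variant allele.
    t = Fraction(index_copies, 2)
    return [(1 - t) * (1 - q), t * (1 - q) + (1 - t) * q, t * q]


def sibling_probs(index_copies, q):
    """Genotype distribution of a full sibling of the index subject.

    Full siblings share 0, 1 or 2 alleles IBD with probabilities 1/4, 1/2, 1/4.
    """
    share0 = population_genotype_probs(q)
    share1 = parent_or_child_probs(index_copies, q)
    share2 = [Fraction(int(k == index_copies)) for k in range(3)]
    return [
        Fraction(1, 4) * a + Fraction(1, 2) * b + Fraction(1, 4) * c
        for a, b, c in zip(share0, share1, share2)
    ]


if __name__ == "__main__":
    q = Fraction(1, 10)  # hypothetical variant allele frequency
    print("mother of a heterozygous case:", parent_or_child_probs(1, q))
    print("sister of a heterozygous case:", sibling_probs(1, q))
```

Exact fractions are used so that each distribution visibly sums to one; floating-point values for `q` work equally well.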

    Supplementing High-Density SNP Microarrays for Additional Coverage of Disease-Related Genes: Addiction as a Paradigm

    Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that a substantial number of SNPs in these genes are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine whether a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions.
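
To make the LD-tagging idea concrete, here is a minimal sketch of how one might decide whether a candidate SNP is tagged by a SNP on a commercial array. It is an illustration under stated assumptions rather than the database's actual procedure: it assumes phased haplotypes coded 0/1, uses the pairwise r² statistic, and applies a threshold of 0.8, which is a common convention but is not stated in the abstract. The function names and the toy data are hypothetical.

```python
def r_squared(alleles_a, alleles_b):
    """Pairwise r^2 between two biallelic SNPs.

    Each argument lists the allele (0 or 1) carried at that SNP by the
    same phased chromosomes, in the same order.
    """
    n = len(alleles_a)
    if n == 0 or n != len(alleles_b):
        raise ValueError("allele lists must be non-empty and of equal length")
    p_a = sum(alleles_a) / n
    p_b = sum(alleles_b) / n
    p_ab = sum(1 for a, b in zip(alleles_a, alleles_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


def tagging_snps(candidate, array_snps, threshold=0.8):
    """Return the array SNP ids whose r^2 with the candidate meets the threshold."""
    return sorted(
        snp_id
        for snp_id, alleles in array_snps.items()
        if r_squared(candidate, alleles) >= threshold
    )


if __name__ == "__main__":
    candidate = [0, 0, 1, 1, 0, 1, 0, 1]  # toy haplotypes, not real data
    array_snps = {
        "array_snp_1": [0, 0, 1, 1, 0, 1, 0, 1],  # perfectly correlated
        "array_snp_2": [0, 1, 0, 1, 0, 1, 0, 1],  # weakly correlated
    }
    print(tagging_snps(candidate, array_snps))
```

A candidate SNP for which `tagging_snps` returns an empty list would be one that the array does not cover through LD, and so a natural target for supplementary genotyping.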

    The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment.

    OBJECTIVE: Coronavirus disease 2019 (COVID-19) poses societal challenges that require expeditious data and knowledge sharing. Though organizational clinical data are abundant, these are largely inaccessible to outside researchers. Statistical, machine learning, and causal analyses are most successful with large-scale data beyond what is available in any given organization. Here, we introduce the National COVID Cohort Collaborative (N3C), an open science community focused on analyzing patient-level data from many centers. MATERIALS AND METHODS: The Clinical and Translational Science Award Program and scientific community created N3C to overcome technical, regulatory, policy, and governance barriers to sharing and harmonizing individual-level clinical data. We developed solutions to extract, aggregate, and harmonize data across organizations and data models, and created a secure data enclave to enable efficient, transparent, and reproducible collaborative analytics. RESULTS: Organized in inclusive workstreams, we created legal agreements and governance for organizations and researchers; data extraction scripts to identify and ingest positive, negative, and possible COVID-19 cases; a data quality assurance and harmonization pipeline to create a single harmonized dataset; population of the secure data enclave with data, machine learning, and statistical analytics tools; dissemination mechanisms; and a synthetic data pilot to democratize data access. CONCLUSIONS: The N3C has demonstrated that a multisite collaborative learning health network can overcome barriers to rapidly build a scalable infrastructure incorporating multiorganizational clinical data for COVID-19 analytics. We expect this effort to save lives by enabling rapid collaboration among clinicians, researchers, and data scientists to identify treatments and specialized care and thereby reduce the immediate and long-term impacts of COVID-19.

    An Ontology-driven Semantic Mash-up of Gene and Biological Pathway Information: Application to the Domain of Nicotine Dependence

    Objectives: This paper illustrates how Semantic Web technologies (especially RDF, OWL, and SPARQL) can support information integration and make it easy to create semantic mashups (semantically integrated resources). In the context of understanding the genetic basis of nicotine dependence, we integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base. Methods: We use an ontology-driven approach to integrate two gene resources (Entrez Gene and HomoloGene) and three pathway resources (KEGG, Reactome and BioCyc), for five organisms, including humans. We created the Entrez Knowledge Model (EKoM), an information model in OWL for the gene resources, and integrated it with the extant BioPAX ontology designed for pathway resources. The integrated schema is populated with data from the pathway resources, publicly available in BioPAX-compatible format, and gene resources for which a population procedure was created. The SPARQL query language is used to formulate queries over the integrated knowledge base to answer the three biological queries. Results: Simple SPARQL queries could easily identify hub genes, i.e., those genes whose gene products participate in many pathways or interact with many other gene products. The identification of the genes expressed in the brain turned out to be more difficult, due to the lack of a common identification scheme for proteins. Conclusion: Semantic Web technologies provide a valid framework for information integration in the life sciences. Ontology-driven integration represents a flexible, sustainable and extensible solution to the integration of large volumes of information. Additional resources, which enable the creation of mappings between information sources, are required to compensate for heterogeneity across namespaces.
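
The hub-gene query described in this abstract is, at its core, a grouped count over subject-predicate-object triples. The sketch below mimics that aggregation in plain Python. It is an analogue for illustration only, not the authors' SPARQL query: the predicate name `participatesIn`, the gene and pathway identifiers, and the `min_pathways` cutoff are all hypothetical and do not come from EKoM or BioPAX.

```python
from collections import defaultdict


def hub_genes(triples, min_pathways=2):
    """Rank genes by the number of distinct pathways they participate in.

    `triples` is an iterable of (subject, predicate, object) tuples; only
    triples with the hypothetical predicate "participatesIn" are counted.
    """
    pathways_by_gene = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "participatesIn":
            pathways_by_gene[subject].add(obj)
    # Sort by descending pathway count, then by gene id for a stable order.
    ranked = sorted(
        ((gene, len(pathways)) for gene, pathways in pathways_by_gene.items()),
        key=lambda item: (-item[1], item[0]),
    )
    return [(gene, count) for gene, count in ranked if count >= min_pathways]


if __name__ == "__main__":
    toy_triples = [  # placeholder identifiers, not real annotations
        ("geneA", "participatesIn", "pathway1"),
        ("geneA", "participatesIn", "pathway2"),
        ("geneA", "participatesIn", "pathway3"),
        ("geneB", "participatesIn", "pathway1"),
        ("geneB", "participatesIn", "pathway2"),
        ("geneC", "participatesIn", "pathway3"),
        ("geneC", "expressedIn", "tissue1"),
    ]
    print(hub_genes(toy_triples))
```

In a triple store the same result would typically come from a query that groups by gene and counts distinct pathways; the difficulty the abstract reports for brain-expressed genes arises earlier, when identifiers from different sources have to be matched before any such count is possible.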

    Candidate Single Nucleotide Polymorphism Selection using Publicly Available Tools: A Guide for Epidemiologists

    Single nucleotide polymorphisms (SNPs) are the most common form of human genetic variation, with millions present in the human genome. Because only 1% might be expected to confer more than modest individual effects in association studies, the selection of predictive candidate variants for complex disease analyses is formidable. Technologic advances in SNP discovery and the ever-changing annotation of the genome have led to massive informational resources that can be difficult to master across disciplines. A simplified guide is needed. Although methods for evaluating nonsynonymous coding SNPs are known, several other publicly available computational tools can be utilized to assess polymorphic variants in noncoding regions. As an example, the authors applied multiple methods to select SNPs in DNA double-strand break repair genes. They chose to evaluate SNPs that occurred among a preexisting set of 57 validated assays and to justify new assay development for 83 potential SNPs in the DNA-dependent protein kinase catalytic subunit. Of the 140 SNPs, the authors eliminated 119 variants with low or neutral predictions. The existing computational methods they used and the semiquantitative relative ranking strategy they developed can be adapted to a priori SNP selection or post hoc evaluation of variant